1. Introduction to genetic testing and the underlying technology

Personal genetic testing is transforming biology and medicine now that people have access to their own genetic instructions. However, each benefit, such as personalized medicine, assistance in criminal investigations and expanded research, is accompanied by a drawback. Personalized medicine is a privilege that seems to be available mainly to those in higher socioeconomic classes; criminal investigations are riddled with privacy and consent obstacles; and research has taken advantage of minorities or has catered largely to people of European ancestry. There is also serious concern about privacy, since such personal data could be leaked and fall into the wrong hands.

DNA polymorphisms, the most common of which are single nucleotide polymorphisms (SNPs), can be detected in genomes in various ways. In past decades, methods such as Southern blots, PCR and hybridization techniques using microarray chips have been used for genotyping. More recently, DNA-based molecular markers have been a breakthrough technology for detecting SNPs, as these markers can easily identify particular DNA sequences.

2. My genealogy and family history

I was very lucky to have well-documented information about my family history on my paternal side; almost no questions were left unanswered. Our documents included birth locations and dates, death locations and dates, occupations, relocations, and any achievements or major life events. My mother’s side, however, is a complete mystery beyond immediate family.

[Figure: Genealogy tree]

3. DNA matches and relatives in the databases

I would expect to share around 50% of my DNA with each of my parents, 25% with each of my grandparents, 12.5% with each great-grandparent and so on. However, it is important to distinguish between genealogical ancestors and genetic ancestors, as the latter are the ones from whom I actually inherited some DNA. This, of course, applies to autosomal DNA, as the sex chromosomes and mitochondrial DNA are passed down more directly. The number of genealogical ancestors doubles with every generation, but from about eight generations back the number of genetic ancestors is expected to increase roughly linearly rather than exponentially. So around the eighth generation is where I would begin to have ancestors with whom I share no DNA.
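The halving and doubling described above can be sketched in a few lines of base R. This is only an illustration of the averages: it ignores the randomness of recombination and any pedigree collapse (the same person appearing more than once in the tree), and the variable names are my own.

```r
# Generations back: 1 = parents, 2 = grandparents, 3 = great-grandparents, ...
generation <- 1:10

# The number of genealogical ancestors doubles every generation
# (assuming no ancestor appears in the tree more than once).
genealogical_ancestors <- 2^generation

# The average expected autosomal share per ancestor halves every generation.
expected_share_pct <- 100 * 0.5^generation

ancestry <- data.frame(generation, genealogical_ancestors, expected_share_pct)
print(ancestry)
```

By the eighth generation the table lists 256 genealogical ancestors, each expected to contribute well under half a percent on average, which is why some of them may contribute no detectable DNA at all.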

[Figure: DNA Relatives]

I share close to the expected amount of DNA with my aunt (25 % expected, 23.34 % observed) and with my first cousin once removed (6.25 % expected, 5.17 % observed). I share 2.5 % and 2.11 % with two of my second cousins, against an expected 3.13 %. I share 1.78 %, 1.58 % and 1.14 % with three of my third cousins, against an expected 0.78 %.
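To make the comparison easier to read, the expected and observed percentages quoted above can be put side by side. This is a small base-R sketch; the data frame and column names are my own, and the numbers are copied from the paragraph rather than read from the 23andMe export.

```r
# Expected vs. observed shared DNA, using the percentages quoted in the text
relatives <- data.frame(
  relationship = c("aunt", "first cousin once removed",
                   "second cousin", "second cousin",
                   "third cousin", "third cousin", "third cousin"),
  expected_pct = c(25, 6.25, 3.13, 3.13, 0.78, 0.78, 0.78),
  observed_pct = c(23.34, 5.17, 2.5, 2.11, 1.78, 1.58, 1.14)
)

# Ratio of observed to expected: 1 means exactly the average expectation
relatives$ratio <- round(relatives$observed_pct / relatives$expected_pct, 2)
print(relatives)
```

The closer relatives come out slightly below 1, while all three third cousins come out above 1, meaning I share more DNA with them than the average expectation for that relationship.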

4. Ethnicity estimates

My ethnicity results were exactly what I predicted. I already knew that I am 50 % Ashkenazi Jewish, as my paternal side of the family is 100 % Ashkenazi Jewish. My maternal side could potentially have held some surprises, but the results matched my expectations. My 47 % Eastern European ancestry is all from my mother, who believes that her family is Don Cossack. I even expected the 2 % East Asian and Native American result, since Russian history, and thus genealogy, was heavily influenced by the Mongol Empire that ruled over Russia in the 13th and 14th centuries. The German and French influence is most likely from my paternal side, as many Jews originally settled in Germany during Jewish migration through Europe.

There was also 0.2 % Central Asian (Kazakhstan, Uzbekistan, Turkmenistan, etc.) trace DNA as well as 0.2 % North African and West Asian DNA. The Central Asian DNA either also came from Mongol rule over Russia or, as a more modern explanation, from the USSR, which included the Central Asian region. I am not sure where the 0.4 % trace North African DNA comes in. I tested my theories further by looking at DNA similarities between me and some of the people 23andMe listed as possible relatives. By looking at my aunt on my paternal side, I could confirm that my Central Asian DNA most likely came from my mother, as my aunt had no Central Asian DNA. She did, however, have 1 % North African and West Asian DNA, which means that those results most likely come from my Jewish ancestors.

[Figures: Ethnicity percent; Ethnicity map]

5. Older family history based on mitochondrial and Y chromosomes results

The maternal and paternal haplogroups offer approximate ancestry information from roughly ten thousand to a hundred thousand years ago. This is possible because mitochondrial DNA and most of the Y chromosome are passed down without recombination and are therefore largely conserved from one generation to the next. The mutations that do accumulate are informative and can be traced back hundreds of generations. Through family members who have also taken a 23andMe test, I was able to gather more information about my mitochondrial and Y chromosome results.

Mother: Mitochondrial: H1u Y chromosome: ?

Father: Mitochondrial: W3 Y chromosome: E-L29

My mitochondrial haplogroup is H1u, and if I had a Y chromosome, my haplogroup would be E-L29. After further research, I discovered that the H1u lineage is proposed to have split off from other H groups around modern-day Azerbaijan, in the general Caucasus area. This lines up with my mother’s Don Cossack ancestry, as Don Cossacks are believed to have originated in the North Caucasus. The most common maternal haplogroup for Ashkenazi Jews is K, so it is interesting that my father’s maternal haplogroup is W, which is most common in Pakistan and Northern India. I found some research suggesting that W3 originated in the Middle East but spread to Europe around 15,000 years ago and now spans regions of Russia, North Africa, the Caucasus, the Near East, Mongolia and the Indian Subcontinent. E-L29 also originated in the Middle East, about 4,000 years ago, and is extremely common in Ashkenazi Jews.

6. Medically important genotypes and 7. Interesting genotypes

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.0     ✓ dplyr   1.0.5
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(DT)
library(gwascat)
## gwascat loaded.  Use makeCurrentGwascat() to extract current image.
##  from EBI.  The data folder of this package has some legacy extracts.
# Load Files 
mySNPs <- read_tsv("data/genome_A_G_v5_Full_20210322120133.txt", comment = '#',
col_types = 
  cols(
    rsid = col_character(),
    chromosome = col_factor(),
    position = col_integer(),
    genotype = col_factor()
  ))
updated_gwas_data <- as.data.frame(makeCurrentGwascat())
## running read.delim on http://www.ebi.ac.uk/gwas/api/search/downloads/alternative...
## formatting gwaswloc instance...
## NOTE: input data had non-ASCII characters replaced by '*'.
## Warning in which(!is.na(as.numeric(df$CHR_POS))): NAs introduced by coercion
## Warning in gwdf2GRanges(tab, extractDate = as.character(Sys.Date())): NAs
## introduced by coercion
## done.
max(updated_gwas_data$DATE.ADDED.TO.CATALOG)
## [1] "2021-04-16"
last_update <- max(updated_gwas_data$DATE.ADDED.TO.CATALOG)

filter(updated_gwas_data, DATE.ADDED.TO.CATALOG == last_update) %>% select(STUDY) %>% distinct()
##                                                                                                                                                                            STUDY
## 1                                                                    Low-frequency variation near common germline susceptibility loci are associated with risk of Ewing sarcoma.
## 2 The Genetics of Circulating Resistin Level, A Biomarker for Cardiovascular Diseases, Is Informed by Mendelian Randomization and the Unique Characteristics of African Genomes.
## 3                                                                                              Genetic Architecture of Abdominal Aortic Aneurysm in the Million Veteran Program.
## 4                                         A genome-wide association study on fish consumption in a Japanese population-the Japan Multi-Institutional Collaborative Cohort study.
## 5                                                        GWAS of peptic ulcer disease implicates Helicobacter pylori infection, other gastrointestinal disorders and depression.
## 6                                                             Genetic basis of lacunar stroke: a pooled analysis of individual patient data and genome-wide association studies.
filter(updated_gwas_data, DATE.ADDED.TO.CATALOG == last_update) %>% select(LINK) %>% distinct()
##                                   LINK
## 1 www.ncbi.nlm.nih.gov/pubmed/32881892
## 2 www.ncbi.nlm.nih.gov/pubmed/32876488
## 3 www.ncbi.nlm.nih.gov/pubmed/32981348
## 4 www.ncbi.nlm.nih.gov/pubmed/32895509
## 5 www.ncbi.nlm.nih.gov/pubmed/33608531
## 6 www.ncbi.nlm.nih.gov/pubmed/33773637
mySNPs_gwas_table <- inner_join(mySNPs, updated_gwas_data, by = c("rsid" = "SNPS"))
mySNPs_gwas_table$risk_allele_clean <- str_sub(mySNPs_gwas_table$STRONGEST.SNP.RISK.ALLELE, -1)
mySNPs_gwas_table$my_allele_1 <- str_sub(mySNPs_gwas_table$genotype, 1, 1)
mySNPs_gwas_table$my_allele_2 <- str_sub(mySNPs_gwas_table$genotype, 2, 2)
mySNPs_gwas_table$have_risk_allele_count <- if_else(mySNPs_gwas_table$my_allele_1 == mySNPs_gwas_table$risk_allele_clean, 1, 0) + if_else(mySNPs_gwas_table$my_allele_2 == mySNPs_gwas_table$risk_allele_clean, 1, 0)
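To show what the risk-allele columns above compute, here is a small base-R sketch on made-up data. The rsIDs and genotypes are placeholders, and the `rsID-allele` strings (including `?` for an unreported allele) are my assumption about how `STRONGEST.SNP.RISK.ALLELE` is formatted; `substr()` stands in for `str_sub()` so the example runs without extra packages.

```r
# Toy stand-in for mySNPs_gwas_table; all values are invented for illustration
toy <- data.frame(
  rsid = c("rs0000001", "rs0000002", "rs0000003"),
  genotype = c("AG", "GG", "CT"),
  STRONGEST.SNP.RISK.ALLELE = c("rs0000001-G", "rs0000002-G", "rs0000003-?"),
  stringsAsFactors = FALSE
)

# The risk allele is the last character of the "rsID-allele" string
n <- nchar(toy$STRONGEST.SNP.RISK.ALLELE)
toy$risk_allele_clean <- substr(toy$STRONGEST.SNP.RISK.ALLELE, n, n)

# Split the two-letter genotype into its two alleles
toy$my_allele_1 <- substr(toy$genotype, 1, 1)
toy$my_allele_2 <- substr(toy$genotype, 2, 2)

# Count how many of the two alleles match the risk allele (0, 1 or 2)
toy$have_risk_allele_count <-
  (toy$my_allele_1 == toy$risk_allele_clean) +
  (toy$my_allele_2 == toy$risk_allele_clean)

print(toy[, c("rsid", "genotype", "risk_allele_clean", "have_risk_allele_count")])
```

The heterozygous `AG` row counts one risk allele, the homozygous `GG` row counts two, and the row with an unreported allele counts zero, so it is dropped by any later `have_risk_allele_count >= 1` filter.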

There are three medical concerns that I wanted to investigate with my SNPs: asthma, type-2 diabetes, and Crohn’s disease with associated IBD or IBS SNPs.

I have a family history of type-2 diabetes, so I wanted to see which SNPs I have that are associated with diabetes risk.

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "diabetes")) %>%
 distinct()
## # A tibble: 367 x 4
##    rsid      DISEASE.TRAIT                            risk_allele your_geneotype
##    <chr>     <chr>                                    <chr>       <fct>         
##  1 rs121169… Type 2 diabetes                          G           AG            
##  2 rs171061… Type 2 diabetes                          G           GG            
##  3 rs121401… Type 2 diabetes                          G           GG            
##  4 rs602633  Coronary heart disease x type 2 diabete… T           GT            
##  5 rs2282456 Type 2 diabetes                          G           GG            
##  6 rs6032    Macrovascular complications in type 2 d… T           TT            
##  7 rs4077468 Cystic fibrosis-related diabetes         A           AA            
##  8 rs3024505 Type 1 diabetes                          G           GG            
##  9 rs340874  Type 2 diabetes                          C           CT            
## 10 rs340874  Type 2 diabetes                          T           CT            
## # … with 357 more rows
filter(mySNPs, rsid == "rs12255372")
## # A tibble: 1 x 4
##   rsid       chromosome  position genotype
##   <chr>      <fct>          <int> <fct>   
## 1 rs12255372 10         114808902 GG

GG: no increased risk of type-2 diabetes or breast cancer

filter(mySNPs, rsid == "rs4402960")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs4402960 3          185511687 GG

Not linked to type-2 diabetes

filter(mySNPs, rsid == "rs7754840")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs7754840 6          20661250 CG

Not linked to type-2 diabetes

Carrying two copies of a common variant of TCF7L2 doubles the chance of developing diabetes, which puts carriers in a risk category similar to that of being clinically obese.

filter(mySNPs, rsid == "rs7903146")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs7903146 10         114758349 CC

Normal (lower) risk of Type 2 Diabetes and Gestational Diabetes.

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "celiac")) %>%
 distinct()
## # A tibble: 17 x 4
##    rsid       DISEASE.TRAIT                          risk_allele your_geneotype
##    <chr>      <chr>                                  <chr>       <fct>         
##  1 rs13003464 Celiac disease                         G           AG            
##  2 rs10188217 Crohn's disease and celiac disease     C           CT            
##  3 rs7574865  Celiac disease or Rheumatoid arthritis T           GT            
##  4 rs4678523  Celiac disease                         C           CT            
##  5 rs11712165 Celiac disease                         G           GT            
##  6 rs6822844  Celiac disease                         G           GG            
##  7 rs424232   Celiac disease                         C           CC            
##  8 rs10806425 Celiac disease                         A           AC            
##  9 rs2041570  Refractory celiac disease type II      A           AG            
## 10 rs11984075 Celiac disease or Rheumatoid arthritis G           AG            
## 11 rs1953126  Celiac disease or Rheumatoid arthritis T           CT            
## 12 rs1250552  Celiac disease                         A           AG            
## 13 rs7104791  Celiac disease                         T           CT            
## 14 rs3184504  Celiac disease                         C           CT            
## 15 rs653178   Celiac disease or Rheumatoid arthritis C           CT            
## 16 rs2664156  Celiac disease                         C           CC            
## 17 rs11203203 Celiac disease or Rheumatoid arthritis A           AA

One of two SNPs associated with increased risk of Crohn’s disease
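The keyword searches in this section (diabetes, celiac, and the ones that follow) all repeat the same pipeline, so they could be wrapped in one helper. The sketch below is written in base R on invented data; `find_trait_hits` and `toy_table` are hypothetical names of mine, and the columns are assumed to match those of `mySNPs_gwas_table`.

```r
# Hypothetical helper: rows where I carry at least one risk allele and the
# trait description mentions the keyword (case-insensitive, literal match)
find_trait_hits <- function(gwas_table, keyword) {
  carries_risk <- gwas_table$have_risk_allele_count >= 1
  mentions <- grepl(tolower(keyword), tolower(gwas_table$DISEASE.TRAIT),
                    fixed = TRUE)
  hits <- gwas_table[carries_risk & mentions,
                     c("rsid", "DISEASE.TRAIT", "risk_allele_clean", "genotype")]
  unique(hits)  # same role as distinct() in the pipelines above
}

# Invented rows for illustration only
toy_table <- data.frame(
  rsid = c("rs0000001", "rs0000002", "rs0000003", "rs0000003"),
  DISEASE.TRAIT = c("Type 2 diabetes", "Celiac disease", "Asthma", "Asthma"),
  risk_allele_clean = c("G", "C", "T", "T"),
  genotype = c("AG", "TT", "CT", "CT"),
  have_risk_allele_count = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

find_trait_hits(toy_table, "asthma")  # duplicate study rows collapse to one
find_trait_hits(toy_table, "celiac")  # no rows: risk allele not carried
```

With the real table, a call such as `find_trait_hits(mySNPs_gwas_table, "colorectal")` would be expected to play the role of one of the four-line pipelines, under the assumption that the column names are unchanged.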

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "colorectal")) %>%
 distinct()
## # A tibble: 79 x 4
##    rsid     DISEASE.TRAIT                             risk_allele your_geneotype
##    <chr>    <chr>                                     <chr>       <fct>         
##  1 rs72647… Colorectal cancer                         T           TT            
##  2 rs75426… Colorectal cancer                         C           CC            
##  3 rs10920… Metastasis in stage I-III microsatellite… T           CT            
##  4 rs66911… Colorectal cancer                         T           GT            
##  5 rs17011… Colorectal cancer or advanced adenoma     G           AG            
##  6 rs66877… Colorectal cancer                         G           AG            
##  7 rs885036 Progression free survival in metastatic … A           AG            
##  8 rs21637… Colorectal cancer                         G           GG            
##  9 rs11626… Colorectal cancer                         G           GG            
## 10 rs651907 Colorectal cancer                         C           CC            
## # … with 69 more rows
filter(mySNPs, rsid == "rs16892766")
## # A tibble: 1 x 4
##   rsid       chromosome  position genotype
##   <chr>      <fct>          <int> <fct>   
## 1 rs16892766 8          117630683 AA
filter(mySNPs, rsid == "rs4779584")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4779584 15         32994756 CC
filter(mySNPs, rsid == "rs58920878")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

No increased risk of colorectal cancer from rs16892766 or rs4779584; rs58920878 is not in my raw data

filter(mySNPs, rsid == "rs4939827")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4939827 18         46453463 CC

0.73x decreased risk for colorectal cancer

I have been “diagnosed” with asthma, but I have never experienced an asthma attack or had trouble breathing. Therefore, I wanted to see if I have any SNPs associated with asthma.

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "asthma")) %>%
 distinct()
## # A tibble: 153 x 4
##    rsid     DISEASE.TRAIT                             risk_allele your_geneotype
##    <chr>    <chr>                                     <chr>       <fct>         
##  1 rs734999 Asthma                                    C           CT            
##  2 rs301806 Allergic disease (asthma, hay fever or e… T           CT            
##  3 rs12932… Asthma                                    G           GG            
##  4 rs22285… Asthma                                    T           GT            
##  5 rs48456… Asthma                                    G           GG            
##  6 rs41292… Asthma                                    T           TT            
##  7 rs903361 Asthma                                    A           AG            
##  8 rs903361 Asthma                                    G           AG            
##  9 rs10174… Allergic disease (asthma, hay fever or e… G           GG            
## 10 rs232542 Asthma (time to childhood onset) x early… C           CC            
## # … with 143 more rows
filter(mySNPs, rsid == "rs1695")
## # A tibble: 1 x 4
##   rsid   chromosome position genotype
##   <chr>  <fct>         <int> <fct>   
## 1 rs1695 11         67352689 AA

normal asthma risk in certain populations

filter(mySNPs, rsid == "rs2303067")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

rs2303067 (asthma and atopic dermatitis) is not in my raw data, so this SNP cannot be assessed

filter(mySNPs, rsid == "rs4794067")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4794067 17         45808828 CC

2.1x risk for Aspirin Induced Asthma. But possibly lower risk of lupus and intractable Graves’ disease.

The following SNPs are associated with roughly 3x increased asthma risk if exposed to smoke.

filter(mySNPs, rsid == "rs2305480")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs2305480 17         38062196 GG

~3x increased asthma risk if exposed to smoke

filter(mySNPs, rsid == "rs4795400")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

rs4795400 is not in my raw data

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "memory")) %>%
 distinct()
## # A tibble: 10 x 4
##    rsid      DISEASE.TRAIT                            risk_allele your_geneotype
##    <chr>     <chr>                                    <chr>       <fct>         
##  1 rs726296… Short-term memory (digit-span task)      C           CC            
##  2 rs9004    Logical memory (immediate recall) in no… T           CT            
##  3 rs112379… Verbal declarative memory                T           TT            
##  4 rs110747… Verbal declarative memory                T           TT            
##  5 rs429358  Logical memory (immediate recall)        C           CT            
##  6 rs429358  Logical memory (delayed recall)          C           CT            
##  7 rs429358  Age-related cognitive decline (memory) … C           CT            
##  8 rs4420638 Verbal declarative memory                G           AG            
##  9 rs6046393 Verbal declarative memory                T           CT            
## 10 rs1010304 Verbal declarative memory                A           AA
filter(mySNPs, rsid == "rs4680")
## # A tibble: 1 x 4
##   rsid   chromosome position genotype
##   <chr>  <fct>         <int> <fct>   
## 1 rs4680 22         19951271 GG

Warrior: Val, less exploratory, higher COMT enzymatic activity, therefore lower dopamine levels; higher pain threshold, better stress resiliency, albeit with a modest reduction in executive cognition performance under most conditions

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "nicotine")) %>%
 distinct()
## # A tibble: 12 x 4
##    rsid       DISEASE.TRAIT                           risk_allele your_geneotype
##    <chr>      <chr>                                   <chr>       <fct>         
##  1 rs1060061  Nicotine dependence                     T           CT            
##  2 rs4668485  Nicotine dependence symptom count       T           CT            
##  3 rs9379896  Nicotine dependence symptom count       C           CC            
##  4 rs62392942 Nicotine dependence                     T           TT            
##  5 rs4132568  Nicotine glucouronidation               A           AA            
##  6 rs11763343 Nicotine dependence symptom count       A           AG            
##  7 rs4285401  Nicotine use                            A           AG            
##  8 rs7385760  Nicotine dependence symptom count       T           CT            
##  9 rs10828623 Nicotine dependence symptom count       T           CT            
## 10 rs16969968 Fagerstr**m test for nicotine dependen… G           AG            
## 11 rs8075300  Nicotine dependence symptom count       C           CC            
## 12 rs2836823  Nicotine dependence                     T           CT
filter(mySNPs, rsid == "rs3750344 ")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs1051730 ")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

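Neither lookup returned a row, but both rsID strings above end with a trailing space, so they could not match anything in my data. These two SNPs would need to be looked up again without the space before drawing any conclusion about nicotine dependence alleles.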
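An empty result can mean either that the SNP is not on the genotyping chip or that the query string did not match, for example because of stray whitespace. The sketch below is one way to guard against both; `lookup_snps` and `toy_snps` are hypothetical names of mine and the rsIDs are invented.

```r
# Hypothetical helper: look up several rsIDs at once, trimming whitespace and
# reporting which ones are absent from the raw data
lookup_snps <- function(snps, rsids) {
  rsids <- trimws(rsids)
  found <- snps[snps$rsid %in% rsids, , drop = FALSE]
  absent <- setdiff(rsids, snps$rsid)
  list(found = found, not_genotyped = absent)
}

# Invented stand-in for mySNPs
toy_snps <- data.frame(
  rsid = c("rs0000001", "rs0000002"),
  genotype = c("AG", "CC"),
  stringsAsFactors = FALSE
)

# The first query has a trailing space; the second is not in the data
result <- lookup_snps(toy_snps, c("rs0000001 ", "rs0000009"))
result$found
result$not_genotyped
```

Keeping the absent rsIDs in a separate element makes it harder to read "no rows" as "no risk allele", which are different statements.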

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(DISEASE.TRAIT, risk_allele = risk_allele_clean, your_geneotype = genotype) %>% 
 filter(str_detect(tolower(DISEASE.TRAIT), "drug")) %>%
 distinct()
## # A tibble: 18 x 3
##    DISEASE.TRAIT                                      risk_allele your_geneotype
##    <chr>                                              <chr>       <fct>         
##  1 Medication use (drugs used in diabetes)            T           CT            
##  2 QT interval (drug interaction)                     T           CT            
##  3 Cough in response to angiotensin-converting enzym… C           CT            
##  4 Drug-induced Stevens-Johnson syndrome or toxic ep… C           CC            
##  5 Drug-induced Stevens-Johnson syndrome or toxic ep… G           GG            
##  6 Drug-induced Stevens-Johnson syndrome or toxic ep… A           AA            
##  7 Adverse response to chemotherapy (neutropenia/leu… A           AG            
##  8 Adverse response to chemotherapy (neutropenia/leu… G           AG            
##  9 Liver injury in anti-tuberculosis drug treatment   A           AA            
## 10 Medication use (drugs used in diabetes)            A           AG            
## 11 Medication use (drugs used in diabetes)            G           GT            
## 12 Medication use (drugs used in diabetes)            T           TT            
## 13 QT interval (drug interaction)                     A           AG            
## 14 Adverse response to chemotherapy (neutropenia/leu… C           CT            
## 15 Adverse response to chemotherapy (neutropenia/leu… T           CT            
## 16 Adverse response to chemotherapy (neutropenia/leu… G           GG            
## 17 Illicit drug use                                   G           AG            
## 18 QT interval (drug interaction)                     A           AA

Ashkenazi-related alleles

filter(mySNPs, rsid == "rs11209026")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs11209026 1          67705958 GG

higher risk for certain autoimmune diseases.

filter(mySNPs, rsid == "rs386833395")
## # A tibble: 1 x 4
##   rsid        chromosome position genotype
##   <chr>       <fct>         <int> <fct>   
## 1 rs386833395 17         41276045 II
filter(mySNPs, rsid == "rs80357906")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs80357906 17         41209083 DD

no BRCA1 variants

filter(mySNPs, rsid == "rs80359550")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs80359550 13         32914438 II

no BRCA2 variant

not a carrier for cystic fibrosis

filter(mySNPs, rsid == "rs121965064")
## # A tibble: 1 x 4
##   rsid        chromosome  position genotype
##   <chr>       <fct>          <int> <fct>   
## 1 rs121965064 4          187201412 TT
filter(mySNPs, rsid == "rs373297713")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs121965063")
## # A tibble: 1 x 4
##   rsid        chromosome  position genotype
##   <chr>       <fct>          <int> <fct>   
## 1 rs121965063 4          187195347 GG

not a carrier of hemophilia C (1/23 Ashkenazi are carriers)

filter(mySNPs, rsid == "rs111033171")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs137853022")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>
filter(mySNPs, rsid == "rs28939712")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

None of these three familial dysautonomia markers is in my raw data, so my carrier status cannot be determined from this file

Checking for random SNPs

filter(mySNPs, rsid == "rs333")
## # A tibble: 0 x 4
## # … with 4 variables: rsid <chr>, chromosome <fct>, position <int>,
## #   genotype <fct>

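rs333 (the CCR5 deletion linked to HIV resistance) is not in my raw data under this rsID, so HIV resistance cannot be assessed here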

filter(mySNPs, rsid == "rs662799")
## # A tibble: 1 x 4
##   rsid     chromosome  position genotype
##   <chr>    <fct>          <int> <fct>   
## 1 rs662799 11         116663707 AG

1.4x higher early heart attack risk; less weight gain on high fat diets

filter(mySNPs, rsid == "rs7495174")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs7495174 15         28344238 AA

blue/gray eyes more likely

filter(mySNPs, rsid == "rs12913832")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12913832 15         28365618 GG

blue eye color, 99% of the time

filter(mySNPs, rsid == "rs1799971")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs1799971 6          154360797 AA

No stronger alcohol cravings

filter(mySNPs, rsid == "rs4988235")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs4988235 2          136608646 AA

Can digest lactose

filter(mySNPs, rsid == "rs590787")
## # A tibble: 1 x 4
##   rsid     chromosome position genotype
##   <chr>    <fct>         <int> <fct>   
## 1 rs590787 1          25629943 AG

Rh+. I knew I was type A; now I know I’m A+

filter(mySNPs, rsid == "rs4675690")
## # A tibble: 1 x 4
##   rsid      chromosome  position genotype
##   <chr>     <fct>          <int> <fct>   
## 1 rs4675690 2          208507807 TT

show less disgust

filter(mySNPs, rsid == "rs1015362")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1015362 20         32738612 CT

2-4x higher risk of sun sensitivity if part of risk haplotype.

filter(mySNPs, rsid == "rs4911414")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4911414 20         32729444 GT

2-4x higher risk of sun sensitivity if part of risk haplotype

filter(mySNPs, rsid == "rs12821256")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12821256 12         89328335 TT

no additional likelihood of blonde hair

filter(mySNPs, rsid == "rs12203592")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12203592 6            396321 CT

likely presence of freckles, brown hair and high sensitivity of skin to sun exposure.

filter(mySNPs, rsid == "rs35264875")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs35264875 11         68846399 AT

one blonde variant

filter(mySNPs, rsid == "rs12896399")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12896399 14         92773663 TT

Lighter hair color & blue eyes more likely

filter(mySNPs, rsid == "rs1042522")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1042522 17          7579472 CC

Live 3 years longer. Chemotherapy is more effective.

filter(mySNPs, rsid == "rs6968865")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs6968865 7          17287269 TT

Associated with (slightly) increased coffee consumption

Drug Metabolism

filter(mySNPs, rsid == "rs4986893")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs4986893 10         96540410 GG
filter(mySNPs, rsid == "rs28399504")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs28399504 10         96522463 AA
filter(mySNPs, rsid == "rs41291556")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs41291556 10         96535173 TT

normal metabolizer of several commonly prescribed drugs

filter(mySNPs, rsid == "rs12248560")
## # A tibble: 1 x 4
##   rsid       chromosome position genotype
##   <chr>      <fct>         <int> <fct>   
## 1 rs12248560 10         96521657 CT

ultra fast metabolizer of proton pump inhibitors and benefit from tamoxifen treatment; drug metabolism effects; also 0.77x decreased breast cancer risk

filter(mySNPs, rsid == "rs8099917")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs8099917 19         39743165 GT

Moderately lower odds of responding to PEG-IFNalpha/RBV treatment (Hepatitis C treatments)

filter(mySNPs, rsid == "rs1057910")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1057910 10         96741053 AC

average 40% reduction in warfarin metabolism (1/2 SNPs)

filter(mySNPs, rsid == "rs1800460")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1800460 6          18139228 CT

impaired capability of detoxifying byproducts of certain drugs (antineoplastic and immunosuppressant)

filter(mySNPs, rsid == "rs1800462")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1800462 6          18143955 CC

incapable of detoxifying certain drugs (antineoplastic and immunosuppressant)

filter(mySNPs, rsid == "rs1142345")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs1142345 6          18130918 CT

impaired drug metabolism (antineoplastic and immunosuppressant)

filter(mySNPs, rsid == "rs11212617")
## # A tibble: 1 x 4
##   rsid       chromosome  position genotype
##   <chr>      <fct>          <int> <fct>   
## 1 rs11212617 11         108283161 AC

Somewhat increased likelihood of treatment success with metformin (a diabetes medication, relevant because I have an increased chance of diabetes)

filter(mySNPs, rsid == "rs2395029")
## # A tibble: 1 x 4
##   rsid      chromosome position genotype
##   <chr>     <fct>         <int> <fct>   
## 1 rs2395029 6          31431780 TT

No increased risk of drug-induced liver injury when prescribed flucloxacillin

filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY) 
## # A tibble: 23,142 x 5
##    rsid    your_genotype strongest_risk_a… DISEASE.TRAIT    STUDY               
##    <chr>   <fct>         <chr>             <chr>            <chr>               
##  1 rs1126… CT            C                 IgG glycosylati… Loci associated wit…
##  2 rs2803… AA            A                 Body mass index  Meta-analysis of ge…
##  3 rs2803… AA            A                 Body mass index  Meta-analysis of ge…
##  4 rs4252… CT            T                 Height           Hundreds of variant…
##  5 rs1079… CT            C                 Ulcerative coli… Host-microbe intera…
##  6 rs7349… CT            C                 Ulcerative coli… Meta-analysis ident…
##  7 rs7349… CT            C                 Asthma           Genome-wide analysi…
##  8 rs3748… AG            A                 Primary scleros… Genome-wide associa…
##  9 rs3748… AG            A                 Primary scleros… Dense genotyping of…
## 10 rs3890… CT            T                 Rheumatoid arth… Common variants at …
## # … with 23,132 more rows
datatable(
 filter(mySNPs_gwas_table, have_risk_allele_count >= 1) %>%
 select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY )
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
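
The warning indicates that the full result (23,142 rows here) is being embedded in the page. One possible workaround, sketched under assumptions, is to trim the table before rendering it. The helper `trim_for_display` and the 1,000-row cap are illustrative choices and not part of the original analysis. The toy table stands in for `mySNPs_gwas_table` and contains made-up values.

```r
library(dplyr)

# Toy stand-in for the GWAS table (column names follow the code
# above; the values are made up for illustration).
toy_gwas <- tibble::tibble(
  rsid = paste0("rs", 1:5000),
  genotype = rep(c("CT", "AA"), length.out = 5000),
  have_risk_allele_count = rep(c(1L, 2L), length.out = 5000)
)

# Hypothetical helper: keep rows with a risk allele, drop exact
# duplicates, and cap the number of rows handed to the widget.
trim_for_display <- function(tbl, max_rows = 1000) {
  tbl %>%
    filter(have_risk_allele_count >= 1) %>%
    distinct() %>%
    slice_head(n = max_rows)
}

display_tbl <- trim_for_display(toy_gwas)

# Render the trimmed table only if the DT package is installed.
if (requireNamespace("DT", quietly = TRUE)) {
  DT::datatable(display_tbl)
}
```

Capping the rows loses the tail of the table, so sorting by the column of interest before trimming may be worth considering.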
datatable(
 filter(mySNPs_gwas_table,
  have_risk_allele_count > 0,
  str_detect(tolower(INITIAL.SAMPLE.SIZE), "european") | str_detect(tolower(REPLICATION.SAMPLE.SIZE), "european"),
  !is.na(RISK.ALLELE.FREQUENCY) & RISK.ALLELE.FREQUENCY > 0) %>%
 arrange(RISK.ALLELE.FREQUENCY) %>%
 select(rsid, your_genotype = genotype, DISEASE.TRAIT, INITIAL.SAMPLE.SIZE, RISK.ALLELE.FREQUENCY)
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
trait_entry_count <- group_by(mySNPs_gwas_table, DISEASE.TRAIT) %>%
 filter(have_risk_allele_count >= 1) %>%
 summarise(count_of_entries = n())

ggplot(filter(trait_entry_count, count_of_entries > 100), aes(x = reorder(DISEASE.TRAIT, count_of_entries, sum), y = count_of_entries)) +
 geom_col() +
 coord_flip() +
 theme_bw() +
 labs(title = "Traits with over 100 GWAS database entries\nwhere I carry a risk allele", y = "Count of entries", x = "Trait")

# Summarise proportion of SNPs for a given trait where you have a risk allele
trait_snp_proportion <-  filter(mySNPs_gwas_table, risk_allele_clean %in% c("C" ,"A", "G", "T") & my_allele_1 %in% c("C" ,"A", "G", "T") & my_allele_2 %in% c("C" ,"A", "G", "T") ) %>%
mutate(you_have_risk_allele = if_else(have_risk_allele_count >= 1, 1, 0)) %>%
 group_by(DISEASE.TRAIT, you_have_risk_allele) %>%
 summarise(count_of_snps = n_distinct(rsid)) %>%
 mutate(total_snps_for_trait = sum(count_of_snps), proportion_of_snps_for_trait = count_of_snps / sum(count_of_snps) * 100) %>%
 filter(you_have_risk_allele == 1) %>%
 arrange(desc(proportion_of_snps_for_trait)) %>%
 ungroup()
## `summarise()` has grouped output by 'DISEASE.TRAIT'. You can override using the `.groups` argument.
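
To make the grouped calculation above concrete, here is a small worked sketch on made-up data. The table `toy_gwas` and its trait names are hypothetical. Only the column names are taken from the code above. Passing `.groups = "drop_last"` to `summarise()` keeps the same grouping behaviour while silencing the message printed above.

```r
library(dplyr)

# Hypothetical data: Trait A has 4 SNPs (3 with a risk allele),
# Trait B has 2 SNPs (none with a risk allele).
toy_gwas <- tibble::tibble(
  DISEASE.TRAIT = c("Trait A", "Trait A", "Trait A", "Trait A", "Trait B", "Trait B"),
  rsid = c("rs1", "rs2", "rs3", "rs4", "rs5", "rs6"),
  have_risk_allele_count = c(2, 1, 0, 1, 0, 0)
)

toy_proportion <- toy_gwas %>%
  # Flag each row: 1 if at least one risk allele is carried, else 0.
  mutate(you_have_risk_allele = if_else(have_risk_allele_count >= 1, 1, 0)) %>%
  group_by(DISEASE.TRAIT, you_have_risk_allele) %>%
  # Count distinct SNPs per trait and flag; stay grouped by trait.
  summarise(count_of_snps = n_distinct(rsid), .groups = "drop_last") %>%
  # Within each trait, express each count as a percentage of the total.
  mutate(
    total_snps_for_trait = sum(count_of_snps),
    proportion_of_snps_for_trait = count_of_snps / total_snps_for_trait * 100
  ) %>%
  filter(you_have_risk_allele == 1) %>%
  ungroup()

print(toy_proportion)
```

In this toy data, Trait A ends up with 3 of its 4 SNPs carrying a risk allele (75%), while Trait B drops out because none of its SNPs carries one.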
trait_study_count <- filter(mySNPs_gwas_table, risk_allele_clean %in% c("C" ,"A", "G", "T") & my_allele_1 %in% c("C" ,"A", "G", "T") & my_allele_2 %in% c("C" ,"A", "G", "T") ) %>%
 group_by(DISEASE.TRAIT) %>%
 summarise(count_of_studies = n_distinct(PUBMEDID), mean_risk_allele_freq = mean(RISK.ALLELE.FREQUENCY))


trait_snp_proportion <- inner_join(trait_snp_proportion, trait_study_count, by = "DISEASE.TRAIT")

ggplot(filter(trait_snp_proportion, count_of_studies > 1 & proportion_of_snps_for_trait > 70), aes(x = reorder(DISEASE.TRAIT, proportion_of_snps_for_trait, sum), y = proportion_of_snps_for_trait, fill = mean_risk_allele_freq)) +
 geom_col() +
 coord_flip() +
 theme_bw() + 
 labs(title = "Traits where I carry more than 70% of the risk\nalleles studied (more than 1 study involved)", 
 y = "% of SNPs with risk allele", x = "Trait", fill = "Mean risk allele frequency") 

datatable(trait_snp_proportion)
datatable(
 filter(mySNPs_gwas_table, have_risk_allele_count == 2) %>%
 select(rsid, your_genotype = genotype, strongest_risk_allele = risk_allele_clean, DISEASE.TRAIT, STUDY )
)
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html

8. Recommendations for next steps

Being Ashkenazi, I already knew I had an increased susceptibility to certain gastrointestinal conditions, and my test results reaffirmed that I should visit a gastroenterologist and most likely get further testing done. I was not aware that I was predisposed to type 2 diabetes; this does not need to be raised with my medical provider, but I should take it into consideration in my lifestyle choices. What I discovered regarding drug and medication metabolism shocked me: I was not at all aware that I had so many SNPs associated with impaired drug metabolism. That is something I will most definitely tell my medical provider about. Additionally, purely out of curiosity, I would like to have my mother genetically tested because her family history is such a mystery.